#!/bin/bash
#
# Copyright (c) 2020-2021 NVI, Inc.
#
# This file is part of FSL10 Linux distribution.
# (see http://github.com/nvi-inc/fsl10).
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
#

set -e

usage() {
    cat << EOF

This script refreshes the contents of the Spare system disk '/usr2' partition
if you have both Operational and Spare systems.  Rebooting afterwards is
recommended.

IMPORTANT: This script will overwrite everything on the '/usr2' partition of
           the system it is run on.  Use it only on the Spare system.

Usage: $short_me [options]

OPTIONS

 -F       run in Foreground
 -h       help: this output
 -i       ignore file modifications times, forcing ALL files to be updated
 -I       Initialize local '/usr2' and then copy ALL files and directories
 -N       Nothing: this option is used internally and has no effect
 -s       suppress sending email
 -Y       Yes: proceed without prompting for confirmation

The '-i' option should only be rarely needed.  It is recommended to NOT use
the uppercase options except in special circumstances.

DESCRIPTION

This script must be executed by 'root' on the Spare system.  You must be
logged-in as 'root' or promote to 'root' using 'su' or 'sudo'/'root_account'.
It is best if all other sessions are logged-out of both systems and there be no
activity on either system's '/usr2'.  See the 'USING THE INITIALIZATION OPTION'
section below for restrictions that apply when the '-I' option is used.

The script creates a log file of what happened during the refresh operation in
'/root/refresh_spare_usr2_logs' directory. The log file name includes a
time-stamp of when the refresh was started.

An email is sent to 'root' with the log file contents when the script finishes.

If the script is aborted while it is running, it is usually possible to recover
by simply rerunning it.  An important exception to this when using the '-I'
option is described, with recovery steps, in the CRITICAL PHASE section below.

The script may take about three times longer to complete if both the
Operational and Spare systems are refreshing their secondary disks than when
the systems are quiescent.

MODES

The script has two modes, background, which is the default, and foreground
using the '-f' option.

For background refreshes, verify that it has finished, either from the log file
or an email, before performing the finishing-up steps: checking that 'mdstat'
does not show an active recovery and then rebooting.  Those finishing-up steps
are also needed for foreground mode, but it should be apparent from the screen
output when the refresh has finished.

For background refreshes, the overall 'rsync' transfer statistics and final
overall time required are listed in the log file.

For foreground refreshes, a progress meter will be displayed on the terminal.
The two leftmost fields are the cumulative bytes transferred and the overall
completion percentage.  The overall 'rsync' transfer statistics are not listed,
but the final overall time is still listed.

USING THE SYSTEMS WHILE THE SCRIPT IS RUNNING

It is best to NOT login to an account on the Spare system with a home directory
on the '/usr2' partition (logging in as 'root' on a text console is okay) or
otherwise access that partition during the refresh.  Accounts with home
directories on the Spare system '/usr2' partition may not be fully configured
until after the refresh completes.

It is possible to use the Operational system during the refresh if necessary,
but this should be avoided if possible and activity on the '/usr2' partition
should be as limited as possible.  You should not expect any changes on the
Operational system '/usr2' that occur after the refresh starts to be propagated
to the Spare system.  If any files are deleted before they can be transferred,
there will be a warning 'file has vanished: "..."' for each such file, where
'...' is the file name, and there will be a summary warning that starts with
'rsync warning: some files vanished before they could be transferred', but
without additional warnings or errors, the transfer should otherwise be
successful.

USING THE INITIALIZATION OPTION

When using the '-I' option, it is recommended that you login on a text console
as 'root'.  Alternatively, you could login as a non-'root' user either on a
console or by 'ssh' from another system (preferably not the Operational
system).  For either approach as a non-'root' user, you must make sure your
working directory is not on '/usr2', for example by using 'cd /tmp', before
running the script.  It is best to get off '/usr2' before promoting to 'root'.

WARNING: All processes that are using the '/usr2' partition will be terminated.

In addition, if you use the -f' option to refresh in foreground, you must NOT
have originally logged into an X session of an account that has its home
directory on '/usr2' NOR have any ancestor (parent) process of your shell with
its working directory on '/usr2' (if you logged in with 'ssh', this includes
NOT working in an 'xterm' that you started before moving your working directory
off '/usr2').  A refresh will work for both of these cases without the '-f'
option, but you will be automatically logged out while the refresh continues in
background.

Please see the CRITICAL PHASE section below information on recovering from the
script being aborted while using the '-I' option.

CRITICAL PHASE

In many cases, if the script is unexpectedly aborted when using the '-I'
option, it is possible to recover simply by rerunning the script.  An important
exception to this is if the abort occurs in the 'critical phase'.

The critical phase of operation occurs between the printing of

    Starting critical phase. Executing "umount /usr2" ...

and

    ... completed "mount /usr2". The critical phase has finished.

If the script is interrupted or otherwise fails between these outputs, it will
usually be necessary to take corrective action.  This normally consists of
re-executing (using copy-and-paste) the command (everything from between the
double quotes, typically a long string that will probably overwrap your
terminal width) shown in the output line

    Executing "mke2fs -F -t ..."

That line is printed during the critical phase and can be found in the log
file, or if the '-f' option was used, may be visible on the screen.  If the
re-execution of 'mke2fs -F -t ...' fails because '/usr2' is mounted, you will
need to execute 'umount /usr2' before trying again.  After 'mke2fs' has been
re-executed successfully, you can execute 'mount /usr2' and try the script
again.

INSTALLATION

This script must be customized before it is used.  The installation
instructions are included in comments in the script.

EOF
}

short_me="$(basename $0)"
background=1
proceed=
send_email=1
initialize=
ignore=
while getopts FhiINsY opt; do
    case $opt in
        F)
            background=
            ;;
        h)
            usage
            exit
            ;;
        i)
            ignore=1
            ;;
        I)
            initialize=1
            ;;
        N)
            ;;
        s)
            send_email=
            ;;
        Y)
            proceed=1
            ;;
        *)
            echo "Use '$short_me -h' for help" >&2
            exit 1
            ;;
        :)
            echo "Use '$short_me -h' for help" >&2
            exit 1
          ;;
    esac
done
shift $((OPTIND - 1))

special=special_arg_to_run_sub-script
subscript=
if [[ "$1" = "$special" ]]; then
  subscript=1
fi

log=$2

if [[ -z "$subscript" ]]; then

    if [[ -n "$initialize" && "$PWD/" =~ ^/usr2/ ]]; then
       echo "You are on '/usr2'. Try again, but 'cd /tmp' first."
       exit 1
    fi

    if ! [ ${EUID:-`id -u`} = 0 ]; then
       echo "You are not running with 'root' privileges!"
       exit 1
    fi

    if [[ -z "$proceed" ]]; then

        cat << EOF

    IMPORTANT: It is best if no other sessions are logged in on this system and
               no sessions should be logged in on the Operational system.

EOF

        echo -e "Is it okay to proceed with refreshing '/usr2'? (y=yes, n=no) : \c"
        badans=true
        while [ "$badans" = "true" ]
        do
          read ans
          case "$ans" in
            y|yes) echo -e "\nO.K. Continuing with refresh ... "
                   badans=false
                   ;;
            n|no)  echo -e "\nO.K. Exiting."
                   exit
                   ;;
            *)     echo -e "\nPlease answer with y=yes or n=no : \c"
          esac
        done
    fi

#--------------------- Comment out the next two lines ---------------------
echo "This script must be customized before use.  See script for details."
exit 1
#--------------------------------------------------------------------------
#
# To customize this script (as 'root') on the Spare system:
#   1. MAKE ABSOLUTELY SURE YOU ARE WORKING ON THE SPARE SYSTEM.
#   2. Edit '/usr/local/sbin/refresh_spare_usr2':
#        A. Comment out the two commands above (add leading '#'s).
#        B  Change the 'operational' in the 'remote_node=operational' line
#           below to the alias (preferred), FQDN, or IP address of your
#           Operational system.
#        C. For CIS hardened systems only: uncomment the line (remove the
#           leading '#') '#remote_user=spare'.
#        D. Save the result.
#   3. Follow the other directions in the 'Installing refresh_spare_usr2'
#      subsection of the 'RAID Notes for FSL10' document.
#      The CIS hardening approach may be a useful alternative, or necessary if
#      you have a CIS hardened system.  For more information, see the
#      'Installing refresh_spare_usr2 with CIS hardening' subsection of the
#      'CIS Hardening for FSL10' document.
#   4. Test it the first time just after the Spare system's disk has been
#      rotated.

# setup for log file
    dir=/root/refresh_spare_usr2_logs
    if [[ ! -d  "$dir" ]]; then
       mkdir "$dir"
    fi
    log="$dir/$(date +"%Y.%b.%d.%H.%M.%S")".log

#set most options for sub-script
    options=N
    if [[ -z "$send_email" ]]; then
         options="${options}s"
    fi
    if [[ -n "$ignore" ]]; then
        options="${options}i"
    fi
    if [[ -n "$initialize" ]]; then
        options="${options}I"
    fi

    if [[ -n "$background" ]]; then
        cat << EOF

    Going to background, you should log out now or you may be logged out
    automatically.

    It is best to NOT login to either the Spare or Operational system while
    'refresh_spare_usr2' is running, but see the 'USING THE SYSTEMS WHILE THE
    SCRIPT IS RUNNING' section of the '-h' help output for more details.

    After the refresh has finished, wait until 'mdstat' does not show an active
    recovery and then reboot.

EOF
        if [[ -n "$send_email" ]]; then
            cat << EOF
An email with the log file will be sent to 'root' when the script finishes.

EOF
        fi
        echo "Log file is $log"
        echo ""
        nohup "$0" "-$options" "$special" "$log" 1>>$log 2>&1 &
    else
        echo ""
        echo "Log file is $log"
        echo ""
        "$0" "-${options}F" "$special" "$log" 2>&1 | tee "$log"
    fi
    exit
fi

if [[ -z "$background" ]]; then
    echo "Refreshing Spare system disk '/usr2' partition in foreground ..."
else
    echo "Refreshing Spare system disk '/usr2' partition in background ..."
fi
echo "Started at $(date)"
echo "[$short_me 3.0]"

if [[ -n "$initialize" ]]; then

    if [ -z "`cat /proc/mounts | cut -d ' ' -f2 | grep /usr2`" ]; then
     echo ""
     echo "Can't proceed, '/usr2' is not mounted"
     exit 1
    fi

    device=$(grep " /usr2 " /proc/mounts | cut -d' ' -f1)
    type=$(grep " /usr2 " /proc/mounts | cut -d' ' -f3)

    if [[ -z "$background" ]]; then
        cat << EOF

        IMPORTANT: If you are logged out automatically, the refresh FAILED.  You
                   were probably on '/usr2' before you promoted to 'root'.  Try
                   again, but 'cd /tmp' before promoting to 'root'.

EOF
    fi

    fuser -k -M -m /usr2 || :
    sleep 1

    if [[ -n "$background" ]]; then
        echo ""
    fi
    echo "Starting critical phase. Executing \"umount /usr2\" ..."
    echo ""
    umount /usr2
    UUID=$(tune2fs -l $device | grep UUID | grep -o '[^ ]*$')
    echo "Executing \"mke2fs -F -t $type -U $UUID $device\""
    echo ""
    mke2fs -F -t $type -U $UUID $device
    mount /usr2
    echo "... completed \"mount /usr2\".  The critical phase has finished."
fi

if [[ -z "$background" ]]; then
# keep foreground messages in order, this delay may not always be long enough
    sleep 0.1

    cat >/dev/tty << EOF

    You may see a login message from your Operational system.  If you are
    prompted for the 'root' or other password on that system, you should
    provide it.  Then progress information from 'rsync' will be displayed.

    It is best to NOT login to either the Spare or Operational system while
    'refresh_spare_usr2' is running, but see the 'USING THE SYSTEMS WHILE THE
    SCRIPT IS RUNNING' section of the '-h' help output for more details.

    When the transfer finishes, the elapsed time of the transfer will be
    printed and you will see a line that starts: 'Done. ...', and get a prompt.
    Please follow the directions on the 'Done. ...' line.

EOF
    if [[ -n "$send_email" ]]; then
        cat >/dev/tty << EOF
    An email with the log file will be sent to 'root' when the script finishes.

EOF
    fi
fi

remote_user=root

# Use your Operational system name in place of 'operational' in the next line
remote_node=operational

# Uncomment the next line for CIS hardened systems
#remote_user=spare

rsync_options=-aH
if [[ -n "$ignore" ]]; then
    rsync_options="${rsync_options}I"
fi

if [[ -z "$background" ]]; then
    time rsync "$rsync_options" --delete --no-i-r --info=progress2 "$remote_user@$remote_node:/" /usr2 >/dev/tty
else
    time rsync "$rsync_options" --delete --stats --human-readable "$remote_user@$remote_node:/" /usr2
fi

cat << EOF

Done. Wait until 'mdstat' does not show an active recovery and then reboot.
EOF

if [[ -z "$background" ]]; then
    if [[ -n "$send_email" ]]; then
        cat "$log" | /usr/bin/mail -s "foreground refresh_spare_usr2 finished" root &
    fi
    echo "" >/dev/tty
elif [[ -n "$send_email" ]]; then
    cat "$log" | /usr/bin/mail -s "background refresh_spare_usr2 finished" root
fi
